106 research outputs found

    On Autonomic HPC Clouds

    Proceedings of: Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015). Krakow (Poland), September 10-11, 2015. The long tail of science using HPC facilities now looks to instantly available HPC Clouds as a viable alternative to the long waiting queues of supercomputing centers. Although the name HPC Cloud suggests a Cloud service, current HPC-as-a-Service is mainly an offer of bare metal, better named cluster-on-demand. The elasticity and virtualization benefits of Clouds are not exploited by HPC-as-a-Service. In this paper we discuss how the HPC Cloud offer can be improved from one particular point of view, that of automation. After a reminder of the characteristics of the Autonomic Cloud, we project the requirements and expectations onto what we name Autonomic HPC Clouds. Finally, we point towards the expected results of the latest research and development activities related to the topics that were identified. The work related to Autonomic HPC Clouds is supported by the European Commission under grant agreement H2020-6643946 (CloudLightning). The CloudLightning project proposal was prepared by eight partner institutions, three of them earlier partners in the COST Action IC1305 NESUS, benefiting from its inputs for the proposal. The section related to Autonomic Clouds is supported by the Romanian UEFISCDI under grant agreement PN-II-ID-PCE-2011-3-0260 (AMICAS).

    Perspectives on anomaly and event detection in exascale systems

    Proceedings of: IEEE 5th International Conference on Big Data Security on Cloud (BigDataSecurity), 27-29 May 2019, Washington, USA. The design and implementation of exascale systems is nowadays an important challenge. Such a system is expected to combine HPC with Big Data methods and technologies to allow the execution of scientific workloads which are not tractable at present. In this paper we focus on an event and anomaly detection framework, which is crucial in giving a global overview of an exascale system (which in turn is necessary for the successful implementation and exploitation of the system). We propose an architecture for such a framework and show how it can be used to handle failures during job execution. This work has received funding from the EC-funded H2020 ASPIDE project (Agreement 801091). This work was supported with hardware resources by the Romanian grant BID (PN-III-P1-PFE-28).

    WebPS: A Web-based P System Simulator with Query Facilities

    In this paper we present an open-source web-enabled simulator for P systems. We use CLIPS embedded in C, and make the simulator available as a web application, complemented by a query language to specify the results.

    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016)

    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016) Timisoara, Romania. February 8-11, 2016. The PhD Symposium was a very good opportunity for young researchers to share information and knowledge, to present their current research, and to discuss topics with other students in order to look for synergies and common research topics. The idea was very successful and the assessment made by the PhD students was very good. It also helped to achieve one of the major goals of the NESUS Action: to establish an open European research network targeting sustainable solutions for ultrascale computing, aiming at cross-fertilization among HPC, large-scale distributed systems, and big data management; at training; and at bringing together disparate researchers working across different areas by providing a meeting ground where they can exchange ideas, identify synergies, and pursue common activities in research topics such as sustainable software solutions (applications and system software stack), data management, energy efficiency, and resilience. European Cooperation in Science and Technology. COS

    DEPAS: A Decentralized Probabilistic Algorithm for Auto-Scaling

    The dynamic provisioning of virtualized resources offered by cloud computing infrastructures allows applications deployed in a cloud environment to automatically increase and decrease the amount of used resources. This capability is called auto-scaling and its main purpose is to automatically adjust the scale of the system that is running the application to satisfy the varying workload with minimum resource utilization. The need for auto-scaling is particularly important during workload peaks, in which applications may need to scale up to extremely large-scale systems. Both the research community and the main cloud providers have already developed auto-scaling solutions. However, most research solutions are centralized and not suitable for managing large-scale systems; moreover, cloud providers' solutions are bound to the limitations of a specific provider in terms of resource prices, availability, reliability, and connectivity. In this paper we propose DEPAS, a decentralized probabilistic auto-scaling algorithm integrated into a P2P architecture that is cloud provider independent, thus allowing the auto-scaling of services over multiple cloud infrastructures at the same time. Our simulations, which are based on real service traces, show that our approach is capable of: (i) keeping the overall utilization of all the instantiated cloud resources in a target range, and (ii) maintaining service response times close to the ones obtained using optimal centralized auto-scaling approaches. Comment: Submitted to Springer Computin
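The abstract above does not state the decision rule, so the following is only a minimal sketch of how a decentralized probabilistic scaling step could work, not the DEPAS algorithm itself. It assumes each node sees a utilization estimate between 0 and 1 and acts independently, without a central controller. The function name `scaling_decision` and the thresholds `low` and `high` are hypothetical.

```python
import random


def scaling_decision(utilization, low=0.4, high=0.8, rng=random):
    """Decide locally whether to add a node (+1), remove this node (-1), or do nothing (0).

    Illustrative assumption, not taken from the paper: the target utilization
    is the midpoint of the [low, high] range.
    """
    target = (low + high) / 2

    if utilization > high:
        # With N nodes at this utilization, about N * (u - target) / target
        # extra nodes bring the load back to target, so each node spawns one
        # new node with that probability (capped at 1).
        p_add = min(1.0, (utilization - target) / target)
        return 1 if rng.random() < p_add else 0

    if utilization < low:
        # Symmetric case: about N * (target - u) / target nodes are surplus,
        # so each node removes itself with that probability.
        p_remove = min(1.0, (target - utilization) / target)
        return -1 if rng.random() < p_remove else 0

    # Utilization is inside the target range: no action.
    return 0
```

Because every node draws its own random number, the expected number of additions or removals across the system matches the capacity gap even though no node knows the global state. Any such rule would still need validating against real traces.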

    Self-Healing Distributed Scheduling Platform

    Distributed systems require effective mechanisms to manage the reliable provisioning of computational resources from different and distributed providers. Moreover, the dynamic environment that affects the behaviour of such systems and the complexity of these dynamics demand autonomous capabilities to ensure the behaviour of distributed scheduling platforms and to achieve business and user objectives. In this paper we propose a self-adaptive distributed scheduling platform composed of multiple agents implemented as intelligent feedback control loops to support policy-based scheduling and expose self-healing capabilities. Our platform leverages distributed scheduling processes by (i) allowing each provider to maintain its own internal scheduling process, and (ii) implementing self-healing capabilities based on agent module recovery. Simulated tests are performed to determine the optimal number of agents to be used in the negotiation phase without affecting the scheduling cost function. Test results on a real-life platform are presented to evaluate recovery times and optimize platform parameters.

    New directions in mobile, hybrid, and heterogeneous clouds for cyberinfrastructures

    With the increasing availability of mobile devices and of data generated by end users, scientific instruments, and simulations, solving many of our most important scientific and engineering problems requires innovative technical solutions. These solutions should cover the whole chain of data and service processing, from the mobile users to the cloud infrastructure, which must also integrate heterogeneous clouds to provide availability, scalability, and data privacy. This special issue presents the results of particular research works showing advances on mobile, hybrid, and heterogeneous clouds for modern cyberinfrastructures.

    Symbolic Computations based on Grid Services

    The widespread adoption of current Grid technologies is still impeded by a number of problems, one of which is the difficulty of developing and implementing Grid-enabled applications. In another dimension, symbolic computation, which aims to automate the steps of mathematical problem solving, has in recent years become a basis for advanced applications in many areas of computer science. In this context we have recently analyzed and developed Grid extensions of known tools for symbolic computations. We further present in this paper a case study of a Web service-based Grid application for symbolic computations.
